Interpreting deep learning-based stellar mass estimation via causal analysis and mutual information decomposition
Zhang, Wei, Lin, Qiufan, Ting, Yuan-Sen, Chen, Shupei, Ruan, Hengxin, Li, Song, Wang, Yifan
End-to-end deep learning models fed with multi-band galaxy images are powerful data-driven tools used to estimate galaxy physical properties in the absence of spectroscopy. However, due to a lack of interpretability and the associational nature of such models, it is difficult to understand how the information that is included in addition to integrated photometry (e.g., morphology) contributes to the estimation task. Improving our understanding in this field would enable further advances in unraveling the physical connections among galaxy properties and optimizing data exploitation. Therefore, our work is aimed at interpreting the deep learning-based estimation of stellar mass via two interpretability techniques: causal analysis and mutual information decomposition. The former reveals the causal paths between multiple variables beyond nondirectional statistical associations, while the latter quantifies the multicomponent contributions (i.e., redundant, unique, and synergistic) of different input data to the stellar mass estimation. Using data from the Sloan Digital Sky Survey (SDSS) and the Wide-field Infrared Survey Explorer (WISE), we obtained meaningful results that provide physical interpretations for image-based models. Our work demonstrates the gains from combining deep learning with interpretability techniques, and holds promise in promoting more data-driven astrophysical research (e.g., astrophysical parameter estimations and investigations on complex multivariate physical processes).
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
WATSON-Net: Vetting, Validation, and Analysis of Transits from Space Observations with Neural Networks
Dévora-Pajares, M., Pozuelos, F. J., Suárez, J. C., González-Penedo, M., Dafonte, C.
Context. As the number of detected transiting exoplanet candidates continues to grow, the need for robust and scalable automated tools to prioritize or validate them has become increasingly critical. Among the most promising solutions, deep learning models offer the ability to interpret complex diagnostic metrics traditionally used in the vetting process. Aims. In this work, we present WATSON-Net, a new open-source neural network classifier and data preparation package designed to compete with current state-of-the-art tools for vetting and validation of transiting exoplanet signals from space-based missions. Methods. Trained on Kepler Q1-Q17 DR25 data using 10-fold cross-validation, WATSON-Net produces ten independent models, each evaluated on dedicated validation and test sets. The ten models are calibrated and prepared to be extensible for TESS data by standardizing the input pipeline, allowing for performance assessment across different space missions. Results. For Kepler targets, WATSON-Net achieves a recall of 0.903 at a precision of 0.99 (R@P0.99), ranking second, with only the ExoMiner network performing better (R@P0.99 = 0.936). For TESS signals, WATSON-Net emerges as the best-performing non-fine-tuned machine learning classifier, achieving a precision of 0.93 and a recall of 0.76 on a test set comprising confirmed planets and false positives. Both the model and its data preparation tools are publicly available in the dearwatson Python package, which is fully open-source and integrated into the vetting engine of the SHERLOCK pipeline.
- North America > United States (0.14)
- Europe > Spain > Galicia > A Coruña Province > A Coruña (0.04)
- Europe > Spain > Andalusia > Granada Province > Granada (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
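The R@P0.99 figure quoted in the WATSON-Net abstract is a recall measured at a fixed precision. As a rough illustration only, the metric could be computed as in the sketch below. The function name `recall_at_precision`, its arguments, and the toy labels and scores are assumptions made for this example; they are not taken from the dearwatson package or from the paper.

```python
import numpy as np

def recall_at_precision(y_true, scores, min_precision=0.99):
    """Highest recall over all score thresholds whose precision is >= min_precision.

    Hypothetical helper for illustration; assumes non-empty inputs, binary labels
    (1 = planet, 0 = false positive) and higher scores meaning "more planet-like".
    """
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)

    # Sort candidates from the highest to the lowest score.
    order = np.argsort(-scores, kind="stable")
    y_sorted = y_true[order]
    s_sorted = scores[order]

    # Cumulative true and false positives when accepting the top-k candidates.
    tp = np.cumsum(y_sorted)
    fp = np.cumsum(~y_sorted)

    # A threshold can only be placed after the last candidate of a group of tied scores.
    is_cut = np.r_[s_sorted[1:] != s_sorted[:-1], True]

    precision = tp / (tp + fp)
    recall = tp / max(int(y_true.sum()), 1)

    ok = is_cut & (precision >= min_precision)
    return float(recall[ok].max()) if ok.any() else 0.0

# Toy data (made up for this example): three planets, two false positives.
labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.1]
r99 = recall_at_precision(labels, scores, min_precision=0.99)
r75 = recall_at_precision(labels, scores, min_precision=0.75)
```

With the toy data, only the two highest-scoring candidates can be accepted before a false positive enters, so the recall at a precision of 0.99 is 2/3; relaxing the precision requirement to 0.75 allows all three planets to be recovered.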
RGC: a radio AGN classifier based on deep learning. I. A semi-supervised model for the VLA images of bent radio AGNs
Hossain, M. S., Shahal, M. S. H., Khan, A., Asad, K. M. B., Saikia, P., Akter, F., Ali, A., Amin, M. A., Momen, A., Hasan, M., Rahman, A. K. M. M.
Wide-angle tail (WAT) and narrow-angle tail (NAT) radio active galactic nuclei (RAGNs) are key tracers of dense environments in galaxy groups and clusters, yet no machine-learning classifier of bent RAGNs has been trained using both unlabeled data and purely visually inspected labels. We release the RGC Python package, which includes two newly preprocessed labeled datasets of 639 WATs and NATs derived from a publicly available catalog of visually inspected sources, along with a semi-supervised RGC model that leverages 20,000 unlabeled RAGNs. The two labeled datasets in RGC were preprocessed using PyBDSF, which retains spurious sources, and Photutils, which removes them. The RGC model integrates the self-supervised framework BYOL (Bootstrap YOur Latent) with the supervised E2CNN (E2-equivariant Convolutional Neural Network) to form a semi-supervised binary classifier. The RGC model, when trained and evaluated on a dataset devoid of spurious sources, reaches peak performance, attaining an accuracy of 88.88% along with F1-scores of 0.90 for WATs and 0.85 for NATs. The model's attention patterns amid class imbalance suggest that this work can serve as a stepping stone toward developing physics-informed foundation models capable of identifying a broad range of AGN physical properties.
- Oceania > Australia (0.04)
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
- Africa > South Africa (0.04)
Extracting latent representations from X-ray spectra. Classification, regression, and accretion signatures of Chandra sources
Vago, Nicolò Oreste Pinciroli, Martínez-Galarza, Juan Rafael, Amato, Roberta
The study of X-ray spectra is crucial to understanding the physical nature of astrophysical sources. Machine learning methods can extract compact and informative representations of data from large datasets. The Chandra Source Catalog (CSC) provides a rich archive of X-ray spectral data, which remains largely underexplored in this context. This work aims to develop a compact and physically meaningful representation of Chandra X-ray spectra using deep learning. To verify that the learned representation captures relevant information, we evaluate it through classification, regression, and interpretability analyses. We use a transformer-based autoencoder to compress X-ray spectra. The input spectra, drawn from the CSC, include only high-significance detections. Astrophysical source types and physical summary statistics are compiled from external catalogs. We evaluate the learned representation in terms of spectral reconstruction accuracy, clustering performance on 8 known astrophysical source classes, and correlation with physical quantities such as hardness ratios and hydrogen column density ($N_H$). The autoencoder accurately reconstructs spectra with 8 latent variables. Clustering in the latent space yields a balanced classification accuracy of $\sim$40% across the 8 source classes, increasing to $\sim$69% when restricted to AGNs and stellar-mass compact objects exclusively. Moreover, latent features correlate with non-linear combinations of spectral fluxes, suggesting that the compressed representation encodes physically relevant information. The proposed autoencoder-based pipeline is a powerful tool for the representation and interpretation of X-ray spectra, providing a compact latent space that supports both classification and the estimation of physical properties. This work demonstrates the potential of deep learning for spectral studies and uncovering new patterns in X-ray data.
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California (0.04)
Learning novel representations of variable sources from multi-modal $\textit{Gaia}$ data via autoencoders
Huijse, P., De Ridder, J., Eyer, L., Rimoldini, L., Holl, B., Chornay, N., Roquette, J., Nienartowicz, K., de Fombelle, G. Jevardat, Fritzewski, D. J., Kemp, A., Vanlaer, V., Vanrespaille, M., Wang, H., Carnerero, M. I., Raiteri, C. M., Marton, G., Madarász, M., Clementini, G., Gavras, P., Aerts, C.
Gaia Data Release 3 (DR3) published for the first time epoch photometry, BP/RP (XP) low-resolution mean spectra, and supervised classification results for millions of variable sources. This extensive dataset offers a unique opportunity to study their variability by combining multiple Gaia data products. In preparation for DR4, we propose and evaluate a machine learning methodology capable of ingesting multiple Gaia data products to achieve an unsupervised classification of stellar and quasar variability. A dataset of 4 million Gaia DR3 sources is used to train three variational autoencoders (VAE), which are artificial neural networks (ANNs) designed for data compression and generation. One VAE is trained on Gaia XP low-resolution spectra, another on a novel approach based on the distribution of magnitude differences in the Gaia G band, and the third on folded Gaia G band light curves. Each Gaia source is compressed into 15 numbers, representing the coordinates in a 15-dimensional latent space generated by combining the outputs of these three models. The learned latent representation produced by the ANN effectively distinguishes between the main variability classes present in Gaia DR3, as demonstrated through both supervised and unsupervised classification analysis of the latent space. The results highlight a strong synergy between light curves and low-resolution spectral data, emphasising the benefits of combining the different Gaia data products. A two-dimensional projection of the latent variables reveals numerous overdensities, most of which strongly correlate with astrophysical properties, showing the potential of this latent space for astrophysical discovery. We show that the properties of our novel latent representation make it highly valuable for variability analysis tasks, including classification, clustering and outlier detection.
- Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
- Europe > Netherlands > Gelderland > Nijmegen (0.04)
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
The hunt for new pulsating ultraluminous X-ray sources: a clustering approach
Vago, Nicolò Oreste Pinciroli, Amato, Roberta, Imbrogno, Matteo, Israel, GianLuca, Belfiore, Andrea, Kovlakas, Konstantinos, Fraternali, Piero, Pasquato, Mario
The discovery of fast and variable coherent signals in a handful of ultraluminous X-ray sources (ULXs) testifies to the presence of super-Eddington accreting neutron stars, and drastically changed the understanding of the ULX class. Our capability of discovering pulsations in ULXs is limited, among other factors, by poor statistics. However, catalogues and archives of high-energy missions contain information which can be used to identify new candidate pulsating ULXs (PULXs). The goal of this research is to single out candidate PULXs among those ULXs which have not shown pulsations due to an unfavourable combination of factors. We applied an AI approach to an updated database of ULXs detected by XMM-Newton. We first used an unsupervised clustering algorithm to sort sources with similar characteristics into two clusters. Then, the sample of known PULX observations was used to set the separation threshold between the two clusters and to identify the one containing the new candidate PULXs. We found that only a few criteria are needed to assign an observation to one of the two clusters. The cluster of new candidate PULXs contains 85 unique sources across 355 observations, with $\sim$85% of these new candidates having multiple observations. A preliminary timing analysis found no new pulsations for these candidates. This work presents a sample of new candidate PULXs observed by XMM-Newton, the properties of which are similar (in a multi-dimensional phase space) to those of the known PULXs, despite the absence of pulsations in their light curves. While this result is a clear example of the predictive power of AI-based methods, it also highlights the need for high-statistics observational data to reveal coherent signals from the sources in this sample and thus validate the robustness of the approach.
- North America > Canada > Quebec > Montreal (0.14)
- North America > United States (0.14)
- Asia > Middle East > Israel (0.05)
A transformer-based generative model for planetary systems
Alibert, Yann, Davoult, Jeanne, Marques, Sara
Numerical calculations of planetary system formation are very demanding in terms of computing power. These synthetic planetary systems can, however, provide access to correlations, as predicted in a given numerical framework, between the properties of planets in the same system. Such correlations can, in turn, be used to guide and prioritize observational campaigns aimed at discovering certain types of planets, such as Earth-like planets. Our goal is to develop a generative model capable of capturing correlations and statistical relationships between planets in the same system. Such a model, trained on the Bern model, offers the possibility of generating a large number of synthetic planetary systems at little computational cost, which can be used, for example, to guide observational campaigns. Our generative model is based on the transformer architecture, which is well known to efficiently capture correlations in sequences and is the basis of all modern Large Language Models. To assess the validity of the generative model, we perform visual and statistical comparisons, as well as machine learning-driven tests. Finally, as a use case, we consider the TOI-469 system, in which we aim to predict the possible properties of planets c and d based on the properties of planet b (the first to be detected). We show, using different comparison methods, that the properties of systems generated by our model are very similar to those of the systems computed directly by the Bern model. We also show, in the case of the TOI-469 system, that the generative model makes it possible to predict the properties of planets not yet observed, based on the properties of the already observed planet. We provide our model to the community on our website www.ai4exoplanets.com.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Accelerating exoplanet climate modelling: A machine learning approach to complement 3D GCM grid simulations
Plaschzug, Alexander, Reza, Amit, Carone, Ludmila, Gernjak, Sebastian, Helling, Christiane
With the development of ever-improving telescopes capable of observing exoplanet atmospheres in greater detail and number, there is a growing demand for enhanced 3D climate models to support and help interpret observational data from space missions like CHEOPS, TESS, JWST, PLATO, and Ariel. However, the computationally intensive and time-consuming nature of general circulation models (GCMs) poses significant challenges in simulating a wide range of exoplanetary atmospheres. This study aims to determine whether machine learning (ML) algorithms can be used to predict the 3D temperature and wind structure of arbitrary tidally-locked gaseous exoplanets in a range of planetary parameters. A new 3D GCM grid with 60 inflated hot Jupiters orbiting A, F, G, K, and M-type host stars modelled with ExoRad is introduced. A dense neural network (DNN) and a decision tree algorithm (XGBoost) are trained on this grid to predict local gas temperatures along with horizontal and vertical winds. To ensure the reliability and quality of the ML model predictions, WASP-121 b, HATS-42 b, NGTS-17 b, WASP-23 b, and NGTS-1 b-like planets, which are all targets for PLATO observation, are selected and modelled with ExoRad and the two ML methods as test cases. The DNN predictions for the gas temperatures are accurate to such a degree that the calculated spectra agree within 32 ppm for all but one planet, for which only a single HCN feature reaches a 100 ppm difference. The developed ML emulators can reliably predict the complete 3D temperature field of an inflated warm to ultra-hot tidally locked Jupiter around A to M-type host stars. They provide a fast tool to complement and extend traditional GCM grids for exoplanet ensemble studies. The quality of the predictions is such that no or minimal effects on the gas phase chemistry, and hence on the cloud formation and transmission spectra, are to be expected.
- Europe > Austria > Styria > Graz (0.04)
- North America > United States > Montana > Roosevelt County (0.04)
- Europe > Austria > Vienna (0.04)
Understanding molecular ratios in the carbon and oxygen poor outer Milky Way with interpretable machine learning
Vermariën, Gijs, Viti, Serena, Heyl, Johannes, Fontani, Francesco
Context. The outer Milky Way has a lower metallicity than our solar neighbourhood, but many molecules are still detected in the region. Molecular line ratios can serve as probes to better understand the chemistry and physics in these regions. Aims. We use interpretable machine learning to study 9 different molecular ratios, helping us understand the forward connection between the physics of these environments and the carbon and oxygen chemistries. Methods. Using a large grid of astrochemical models generated using UCLCHEM, we study the properties of molecular clouds of low oxygen and carbon initial abundance. We first try to understand the line ratios using a classical analysis. We then move on to using interpretable machine learning, namely Shapley Additive Explanations (SHAP), to understand the higher order dependencies of the ratios over the entire parameter grid. Lastly, we use the Uniform Manifold Approximation and Projection technique (UMAP) as a reduction method to create intuitive groupings of models. Results. We find that the parameter space is well covered by the line ratios, allowing us to investigate all input parameters. SHAP analysis shows that the temperature and density are the most important features, but the carbon and oxygen abundances are important in parts of the parameter space. Lastly, we find that we can group different types of ratios using UMAP. Conclusions. We show the chosen ratios are mostly sensitive to changes in the carbon initial abundance, together with the temperature and density. The CN/HCN and HNC/HCN ratios in particular are shown to be sensitive to the initial carbon abundance, making them excellent probes for this parameter. Out of the ratios, only CS/SO shows a sensitivity to the oxygen abundance.
- Europe > Netherlands > South Holland > Leiden (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
Latent-space Field Tension for Astrophysical Component Detection: An application to X-ray imaging
Guardiani, Matteo, Eberle, Vincent, Westerkamp, Margret, Rüstig, Julian, Frank, Philipp, Enßlin, Torsten
Modern observatories are designed to deliver increasingly detailed views of astrophysical signals. To fully realize the potential of these observations, principled data-analysis methods are required to effectively separate and reconstruct the underlying astrophysical components from data corrupted by noise and instrumental effects. In this work, we introduce a novel multi-frequency Bayesian model of the sky emission field that leverages latent-space tension as an indicator of model misspecification, enabling automated separation of diffuse, point-like, and extended astrophysical emission components across wavelength bands. Deviations from latent-space prior expectations are used as diagnostics for model misspecification, thus systematically guiding the introduction of new sky components, such as point-like and extended sources. We demonstrate the effectiveness of this method on synthetic multi-frequency imaging data and apply it to observational X-ray data from the eROSITA Early Data Release (EDR) of the SN1987A region in the Large Magellanic Cloud (LMC). Our results highlight the method's capability to reconstruct astrophysical components with high accuracy, achieving sub-pixel localization of point sources, robust separation of extended emission, and detailed uncertainty quantification. The developed methodology offers a general and well-founded framework applicable to a wide variety of astronomical datasets, and is therefore well suited to support the analysis needs of next-generation multi-wavelength and multi-messenger surveys.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
- Government (0.46)
- Energy (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Data Science (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)